 action-value function







Multi-Step Generalized Policy Improvement by Leveraging Approximate Models

Lucas N. Alegre, Ana L. C. Bazzan, Ann Nowé, Bruno C. da Silva

Neural Information Processing Systems

We introduce a principled method for performing zero-shot transfer in reinforcement learning (RL) by exploiting approximate models of the environment. Zero-shot transfer in RL has been investigated by leveraging methods rooted in generalized policy improvement (GPI) and successor features (SFs).


Real-Time Reinforcement Learning

Simon Ramstedt, Chris Pal

Neural Information Processing Systems

While it is well suited to describe turn-based decision problems such as board games, this framework is ill-suited for real-time applications in which the environment's state continues to evolve while the agent selects an action (Travnik et al., 2018). Nevertheless, this framework has been used for real-time problems using what are essentially tricks, e.g.